
Creators/Authors contains: "Sinsheimer, Janet S."


  1. Abstract

    For species of management concern, accurate estimates of inbreeding and associated consequences on reproduction are crucial for predicting their future viability. However, few studies have partitioned this aspect of genetic viability with respect to reproduction in a group-living social mammal. We investigated the contributions of foundation stock lineages, putative fitness consequences of inbreeding, and genetic diversity of the breeding versus nonreproductive segment of the Yellowstone National Park gray wolf population. Our dataset spans 25 years and seven generations since reintroduction, encompassing 152 nuclear families and 329 litters. We found that more than 87% of the pedigree foundation genomes persisted, and we report influxes of allelic diversity from two translocated wolves from a divergent source in Montana. As expected for group-living species, mean kinship significantly increased over time but with minimal loss of observed heterozygosity. Strikingly, the reproductive portion of the population carried significantly lower genome-wide inbreeding coefficients and autozygosity, and showed more rapid decay of linkage disequilibrium, relative to the nonbreeding population. Breeding wolves had significantly longer lifespans and lower inbreeding coefficients than nonbreeding wolves. Our model revealed that the number of litters was significantly negatively associated with heterozygosity (R = −0.11). Our findings highlight genetic contributions to fitness, and the importance of the reproductively active individuals in a population to counteract loss of genetic variation in a wild, free-ranging social carnivore. It is crucial for managers to mitigate factors that significantly reduce effective population size and genetic connectivity, which supports the dispersion of genetic variation that aids in rapid evolutionary responses to environmental challenges.
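The pedigree quantities named in this abstract (kinship and inbreeding coefficients) can be illustrated with a minimal sketch. The pedigree below is a hypothetical toy example, not data from the study, and the recursion is the standard textbook kinship recursion rather than the authors' pipeline. It assumes founders are unrelated and non-inbred, and that parents are listed before their offspring.

```python
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (sire, dam); None marks an unknown (founder) parent.
# Parents must be listed before their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # offspring of a full-sibling mating
}

# Position of each individual in the listing, used to decide which one to recurse on.
ORDER = {name: index for index, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Probability that an allele drawn from i and one drawn from j are identical by descent."""
    if i is None or j is None:
        return 0.0  # assumption: unknown parents are unrelated founders
    if i == j:
        sire, dam = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(sire, dam))
    # Recurse through the parents of the individual listed later,
    # which cannot be an ancestor of the other one.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    sire, dam = PEDIGREE[i]
    return 0.5 * (kinship(sire, j) + kinship(dam, j))


def inbreeding(i):
    """Pedigree inbreeding coefficient: the kinship between the two parents of i."""
    sire, dam = PEDIGREE[i]
    return kinship(sire, dam)
```

Averaging `kinship(i, j)` over all pairs of living individuals gives a mean kinship of the kind tracked over time in the abstract.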

  2. Kelso, Janet (Ed.)
    Abstract

    Motivation

    Current methods for genotype imputation and phasing exploit the volume of data in haplotype reference panels and rely on hidden Markov models (HMMs). Existing programs all have essentially the same imputation accuracy, are computationally intensive and generally require prephasing the typed markers.

    Results

    We introduce a novel data-mining method for genotype imputation and phasing that substitutes highly efficient linear algebra routines for HMM calculations. This strategy, embodied in our Julia program MendelImpute.jl, avoids explicit assumptions about recombination and population structure while delivering similar prediction accuracy, better memory usage and run-times better by an order of magnitude or more, compared to the fastest competing method. MendelImpute operates on both dosage data and unphased genotype data and simultaneously imputes missing genotypes and phase at both the typed and untyped SNPs (single nucleotide polymorphisms). Finally, MendelImpute naturally extends to global and local ancestry estimation and lends itself to new strategies for data compression and hence faster data transport and sharing.

    Availability and implementation

    Software, documentation and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelImpute.jl.

    Supplementary information

    Supplementary data are available at Bioinformatics online.
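To give a feel for how inner products can stand in for HMM calculations, here is a toy sketch. It is an assumption about the general flavor of such a method, not the MendelImpute.jl algorithm or API: it searches for the pair of reference haplotypes whose sum is closest, in least squares, to the observed genotype vector, then reads missing genotypes and phase off that pair. The reference panel and all function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Hypothetical reference panel: each row is one haplotype (0/1 alleles at 6 SNPs).
REFERENCE = [
    [0, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def impute(genotype, reference):
    """Fill the missing entries (None) of an unphased genotype vector.

    Chooses the haplotype pair (i, j) minimizing ||x - h_i - h_j||^2 over the
    observed SNPs. Expanding the square shows that only inner products are
    needed, so the x.x term, which is constant across pairs, is dropped.
    """
    observed = [k for k, g in enumerate(genotype) if g is not None]
    x = [genotype[k] for k in observed]
    panel = [[h[k] for k in observed] for h in reference]

    x_dot_h = [dot(x, h) for h in panel]                  # x . h_i
    gram = [[dot(hi, hj) for hj in panel] for hi in panel]  # h_i . h_j

    def objective(pair):
        i, j = pair
        return gram[i][i] + gram[j][j] + 2 * gram[i][j] - 2 * (x_dot_h[i] + x_dot_h[j])

    i, j = min(combinations_with_replacement(range(len(panel)), 2), key=objective)
    imputed = [
        reference[i][k] + reference[j][k] if g is None else g
        for k, g in enumerate(genotype)
    ]
    return imputed, (i, j)
```

The returned pair of haplotype indices doubles as a phasing of the sample, which is one way imputation and phasing can be handled in a single step. A real implementation would work window by window and use optimized matrix routines rather than Python loops.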
  3. Ndeffo Mbah, Martial L (Ed.)
    The SARS-CoV-2 pandemic led to closure of nearly all K-12 schools in the United States of America in March 2020. Although reopening K-12 schools for in-person schooling is desirable for many reasons, officials understand that risk reduction strategies and detection of cases are imperative in creating a safe return to school. Furthermore, the consequences of reclosing recently opened schools are substantial and affect teachers, parents, and ultimately the educational experiences of children. To balance the competing interests of meeting educational needs and protecting public safety, we compare the impact of physical separation through school cohorts on SARS-CoV-2 infections against policies acting at the level of individual contacts within classrooms. Using an age-stratified Susceptible-Exposed-Infected-Removed model, we explore influences of reduced class density, transmission mitigation, and viral detection on cumulative prevalence. We consider several scenarios over a 6-month period including (1) multiple rotating cohorts in which students cycle through in-person instruction on a weekly basis, (2) parallel cohorts with in-person and remote learning tracks, (3) the impact of a hypothetical testing program with ideal and imperfect detection, and (4) varying levels of aggregate transmission reduction. Our mathematical model predicts that reducing the number of contacts through cohorts produces a larger effect than diminishing transmission rates per contact. Specifically, the latter approach requires dramatic reduction in transmission rates in order to achieve a comparable effect in minimizing infections over time. Further, our model indicates that surveillance programs using less sensitive tests may be adequate in monitoring infections within a school community by both keeping infections low and allowing for a longer period of instruction. Lastly, we underscore the importance of factoring infection prevalence in deciding when a local outbreak of infection is serious enough to require reverting to remote learning.
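The contrast between rotating cohorts and per-contact transmission reduction can be sketched with a much simpler model than the one described above. The following is a toy, discrete-time Susceptible-Exposed-Infected-Removed simulation with no age structure, no testing, and no community transmission. Every parameter value is a hypothetical placeholder, and the sketch assumes that only the cohort present in school in a given week mixes, with contacts proportional to the number of students present.

```python
def simulate(num_cohorts=1, transmission_scale=1.0, days=180,
             beta=0.5, sigma=1 / 3, gamma=1 / 7, seed_fraction=1e-3):
    """Return the cumulative fraction of students ever infected.

    num_cohorts        -- cohorts rotating weekly through in-person instruction
    transmission_scale -- multiplier on per-contact transmission (1.0 = no mitigation)
    beta, sigma, gamma -- hypothetical daily transmission, incubation, and removal rates
    """
    share = 1.0 / num_cohorts
    # Compartments are stored as fractions of the whole student population.
    cohorts = [
        {"S": share * (1 - seed_fraction), "E": 0.0, "I": share * seed_fraction, "R": 0.0}
        for _ in range(num_cohorts)
    ]
    for day in range(days):
        in_school = (day // 7) % num_cohorts  # cohort attending in person this week
        for index, c in enumerate(cohorts):
            # Assumption: transmission happens only within the cohort that is in school.
            force = beta * transmission_scale * c["I"] if index == in_school else 0.0
            new_exposed = force * c["S"]
            new_infectious = sigma * c["E"]
            new_removed = gamma * c["I"]
            c["S"] -= new_exposed
            c["E"] += new_exposed - new_infectious
            c["I"] += new_infectious - new_removed
            c["R"] += new_removed
    return 1.0 - sum(c["S"] for c in cohorts)
```

Comparing `simulate(num_cohorts=2)` with `simulate(transmission_scale=0.5)` contrasts the two kinds of intervention. In this toy setting the cohort schedule ends with fewer cumulative infections, which is in the same direction as the abstract's conclusion, but the numbers carry no quantitative meaning.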
  4. Abstract

    Background

    Statistical geneticists employ simulation to estimate the power of proposed studies, test new analysis tools, and evaluate properties of causal models. Although there are existing trait simulators, there is ample room for modernization. For example, most phenotype simulators are limited to Gaussian traits or traits transformable to normality, while ignoring qualitative traits and realistic, non-normal trait distributions. Also, modern computer languages, such as Julia, that accommodate parallelization and cloud-based computing are now mainstream but rarely used in older applications. To meet the challenges of contemporary big studies, it is important for geneticists to adopt new computational tools.

    Results

    We present an open-source Julia package that makes it trivial to quickly simulate phenotypes under a variety of genetic architectures. This package is integrated into our OpenMendel suite for easy downstream analyses. Julia was purpose-built for scientific programming and provides tremendous speed and memory efficiency, along with easy access to multi-CPU and GPU hardware and to distributed and cloud-based parallelization. The package is designed to encourage flexible trait simulation, including via the standard devices of applied statistics, generalized linear models (GLMs) and generalized linear mixed models (GLMMs). It also accommodates many study designs: unrelateds, sibships, pedigrees, or a mixture of all three. (Of course, for data with pedigrees or cryptic relationships, the simulation process must include the genetic dependencies among the individuals.) We consider an assortment of trait models and study designs to illustrate integrated simulation and analysis pipelines. Step-by-step instructions for these analyses are available in our electronic Jupyter notebooks on GitHub. These interactive notebooks are ideal for reproducible research.

    Conclusion

    The package has three main advantages. (1) It leverages the computational efficiency and ease of use of Julia to provide extremely fast, straightforward simulation of even the most complex genetic models, including GLMs and GLMMs. (2) It can be operated entirely within, but is not limited to, the integrated analysis pipeline of OpenMendel. And finally (3), by allowing a wider range of more realistic phenotype models, it brings power calculations and diagnostic tools closer to what investigators might see in real-world analyses.
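As a rough illustration of what simulating a non-Gaussian trait under a GLM involves, here is a minimal sketch for unrelated individuals. It is written in Python purely for illustration and does not reproduce the Julia package's interface. The function names, effect sizes, and allele frequency are hypothetical, and genotypes are assumed to follow Hardy-Weinberg proportions.

```python
import math
import random


def simulate_genotypes(n, maf, rng):
    """Draw n minor-allele counts (0, 1, or 2), assuming Hardy-Weinberg proportions."""
    return [sum(rng.random() < maf for _ in range(2)) for _ in range(n)]


def poisson_draw(mean, rng):
    """Knuth's multiplication method for a Poisson draw; adequate for small means."""
    limit, count, product = math.exp(-mean), 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def simulate_trait(genotypes, intercept, effect, family, rng):
    """Simulate one trait value per individual from a single-SNP GLM.

    family="bernoulli" uses a logit link; family="poisson" uses a log link.
    """
    traits = []
    for g in genotypes:
        eta = intercept + effect * g  # linear predictor
        if family == "bernoulli":
            mu = 1.0 / (1.0 + math.exp(-eta))
            traits.append(1 if rng.random() < mu else 0)
        elif family == "poisson":
            traits.append(poisson_draw(math.exp(eta), rng))
        else:
            raise ValueError(f"unsupported family: {family}")
    return traits


rng = random.Random(2024)  # fixed seed so the simulation is reproducible
genotypes = simulate_genotypes(500, maf=0.3, rng=rng)
case_status = simulate_trait(genotypes, intercept=-1.0, effect=0.5, family="bernoulli", rng=rng)
lesion_counts = simulate_trait(genotypes, intercept=0.2, effect=0.3, family="poisson", rng=rng)
```

For pedigrees or cryptically related samples, independent draws like these are not sufficient. As the abstract notes, the simulation must then include the genetic dependencies among individuals, which is where GLMMs come in.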

  5. Abstract

    The availability of vast amounts of longitudinal data from electronic health records (EHRs) and personal wearable devices opens the door to numerous new research questions. In many studies, individual variability of a longitudinal outcome is as important as the mean. Blood pressure fluctuations, glycemic variations, and mood swings are prime examples where it is critical to identify factors that affect the within‐individual variability. We propose a scalable method, within‐subject variance estimator by robust regression (WiSER), for the estimation and inference of the effects of both time‐varying and time‐invariant predictors on within‐subject variance. It is robust against the misspecification of the conditional distribution of responses or the distribution of random effects. Its performance is similar to that of correctly specified likelihood methods, but it is 10³ to 10⁵ times faster. The estimation algorithm scales linearly in the total number of observations, making it applicable to massive longitudinal data sets. The effectiveness of WiSER is evaluated in extensive simulation studies. Its broad applicability is illustrated using the accelerometry data from the Women's Health Study and a clinical trial for longitudinal diabetes care.

  6. Abstract

    Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard‐to‐interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for biobank‐scale data. Our method is available as a Julia package, OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case–control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.
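To make the idea of an ordered multinomial model concrete, the sketch below assumes the common proportional-odds (cumulative logit) form, in which the probability that the trait falls at or below category j is logistic(θ_j − β·g) for increasing thresholds θ_j. Whether this matches the exact link and parameterization used by the authors is an assumption. The thresholds, effect size, and function names are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(thresholds, effect, genotype):
    """Category probabilities under a proportional-odds model.

    thresholds -- increasing cut points theta_1 < ... < theta_{J-1}
    effect     -- SNP effect beta on the latent scale
    genotype   -- minor-allele count (0, 1, or 2)

    P(Y <= j) = logistic(theta_j - beta * genotype), and each category
    probability is the difference of consecutive cumulative probabilities.
    """
    eta = effect * genotype
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probabilities, previous = [], 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical four-level trait (for example: normal, elevated, stage 1, stage 2).
for g in (0, 1, 2):
    print(g, [round(p, 3) for p in category_probabilities([-0.5, 0.8, 2.0], 0.4, g)])
```

A single effect parameter shifts probability mass toward higher categories as the allele count grows. This is how the ordering of the categories is used, in contrast to an unordered multinomial model with a separate effect per category.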

  7. Abstract

    Aggression is a quantitative trait deeply entwined with individual fitness. Mapping the genomic architecture underlying such traits is complicated by complex inheritance patterns, social structure, pedigree information and gene pleiotropy. Here, we leveraged the pedigree of a reintroduced population of grey wolves (Canis lupus) in Yellowstone National Park, Wyoming, USA, to examine the heritability of and the genetic variation associated with aggression. Since their reintroduction, many ecological and behavioural aspects have been documented, providing unmatched records of aggressive behaviour across multiple generations of a wild population of wolves. Using a linear mixed model, a robust genetic relationship matrix, 12,288 single nucleotide polymorphisms (SNPs) and 111 wolves, we estimated the SNP‐based heritability of aggression to be 37% and an additional 14% of the phenotypic variation explained by shared environmental exposures. We identified 598 SNP genotypes from 425 grey wolves to resolve a consensus pedigree that was included in a heritability analysis of 141 individuals with SNP genotype, metadata and aggression data. The pedigree‐based heritability estimate for aggression is 14%, and an additional 16% of the phenotypic variation was explained by shared environmental exposures. We find strong effects of breeding status and relative pack size on aggression. Through an integrative approach, these results provide a framework for understanding the genetic architecture of a complex trait that influences individual fitness, with linkages to reproduction, in a social carnivore. Along with a few other studies, we show here the incredible utility of a pedigreed natural population for dissecting a complex, fitness‐related behavioural trait.
